Introduction
At the end of 2019, a novel coronavirus was identified as the cause of a cluster of pneumonia cases in China. It spread rapidly, resulting in a worldwide epidemic. In February 2020, the World Health Organization designated the disease COVID-19, which stands for coronavirus disease 2019. This report explores the association between COVID-19 deaths and state, sex, and age group.
Methods
The data were obtained from the US Centers for Disease Control and Prevention (CDC). The data include deaths involving coronavirus disease 2019 (COVID-19), pneumonia, and influenza reported to NCHS by sex, age group, and state.
We calculate the proportionate mortality ratio (PMR) due to COVID-19 as COVID-19 deaths divided by total deaths.
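As a quick check of this definition, the sketch below recomputes the ratio for two jurisdictions using counts that appear in the tables later in this report. The object and column names (`pmr_check`, `covid_deaths`, `total_deaths`) are illustrative and are not the names used in the CDC file.

```r
# Counts copied from the tables in this report; names are illustrative only.
pmr_check <- data.frame(
  State        = c("California", "New York City"),
  covid_deaths = c(15534, 20763),
  total_deaths = c(202764, 62910),
  stringsAsFactors = FALSE
)

# PMR = COVID-19 deaths / total deaths, a proportion between 0 and 1
pmr_check$PMR <- pmr_check$covid_deaths / pmr_check$total_deaths

print(pmr_check)
```

The resulting values should agree with the PMR column reported for the same rows in the tables.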
library(data.table)
library(dtplyr)
library(dplyr)
library(leaflet)
library(tidyverse)
library(ggplot2)
#Read file
covid = fread("data/Provisional_COVID-19_Death_Counts_by_Sex__Age__and_State.csv")
#Calculate proportionate mortality ratio due to COVID-19
covid$PMR=covid$`COVID-19 Deaths`/covid$`Total Deaths`
Compare the proportionate mortality ratio among states.
Here, we tabulate the proportionate mortality ratio by state and then map it.
#construct simple table of Covid data related to States
covid_state=covid[which(covid$Sex == "All Sexes" & covid$State != "United States" & covid$`Age group`=="All Ages"),]
covid_simp=subset(covid_state, select = c("State","COVID-19 Deaths","Total Deaths","PMR"))
covid_simp= covid_simp[order(-covid_simp$`Total Deaths`),]
knitr::kable(head(covid_simp))

| State | COVID-19 Deaths | Total Deaths | PMR |
|---|---|---|---|
| California | 15534 | 202764 | 0.0766112 |
| Florida | 14828 | 162904 | 0.0910229 |
| Texas | 16560 | 160010 | 0.1034935 |
| Pennsylvania | 8402 | 98410 | 0.0853775 |
| Ohio | 4522 | 86595 | 0.0522201 |
| Illinois | 7930 | 82908 | 0.0956482 |

The same table ordered by PMR (highest first):

| State | COVID-19 Deaths | Total Deaths | PMR |
|---|---|---|---|
| New York City | 20763 | 62910 | 0.3300429 |
| New Jersey | 14362 | 66941 | 0.2145471 |
| Connecticut | 4437 | 23033 | 0.1926367 |
| Massachusetts | 8094 | 47456 | 0.1705580 |
| District of Columbia | 760 | 4896 | 0.1552288 |
| New York | 11665 | 81391 | 0.1433205 |
According to the first table, California has the most total deaths but a low proportionate mortality ratio, which suggests that its large death count mainly reflects its large population. In the second table, New York City has the highest proportionate mortality ratio due to COVID-19. This may be caused by limited living space per capita, since New York City has a large population but little living space.
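The contrast between the two rankings can be reproduced on a small subset of the rows shown in the tables above. This is a minimal sketch in base R; the data frame `top_rows` and its column names are assumptions made for illustration, and the report's own code for the PMR-ordered table is not shown.

```r
# Four rows copied from the tables above; names are illustrative only.
top_rows <- data.frame(
  State        = c("California", "Texas", "New York City", "New Jersey"),
  covid_deaths = c(15534, 16560, 20763, 14362),
  total_deaths = c(202764, 160010, 62910, 66941),
  stringsAsFactors = FALSE
)
top_rows$PMR <- top_rows$covid_deaths / top_rows$total_deaths

# Ordering by total deaths favors jurisdictions with large populations
by_total <- top_rows[order(-top_rows$total_deaths), ]

# Ordering by PMR favors jurisdictions where COVID-19 made up a larger share of deaths
by_pmr <- top_rows[order(-top_rows$PMR), ]

print(by_total$State)
print(by_pmr$State)
```

In this subset California leads the first ordering and New York City leads the second, which matches the pattern described above.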
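#List state names in the COVID data that are missing from the map data.
#Note: `states` (state boundaries with a NAME column) is not defined in the code shown here.
#covid_state$State[which(covid_state$State %in% states$NAME ==F)]
#Combine New York City (row 34) with New York State (row 33); columns 7:12 hold the death counts.
covid_state[33,7:12]=covid_state[33,7:12]+covid_state[34,7:12]
covid_state=covid_state[-34, ]
#Recompute PMR from the combined counts
covid_state$PMR=covid_state$`COVID-19 Deaths`/covid_state$`Total Deaths`
#Rename the State column (column 4) to NAME so it matches the map data
colnames(covid_state)[4]="NAME"
mergedata=merge(x = states, y = covid_state, by = "NAME", all.x = TRUE)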
#Construct PMR map
pal <- colorBin("YlOrRd", domain = mergedata$PMR)
leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  setView(-98.483330, 38.712046, zoom = 4) %>%
  addPolygons(
    data = mergedata,
    fillColor = ~pal(PMR),
    fillOpacity = 0.7,
    weight = 0.2,
    smoothFactor = 0.2
  ) %>%
  addLegend(pal = pal,
            values = mergedata$PMR,
            position = "bottomright",
            title = "PMR")
Here we can see that New York State (now including New York City) has the highest proportionate mortality ratio due to COVID-19. That is, COVID-19 accounted for a larger share of all deaths in New York than in any other state.
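The New York figure depends on folding New York City into New York State before recomputing the ratio. The report does this by row position (rows 33 and 34). A name-based variant is sketched below on the two New York rows from the table, as an illustration rather than the report's method; `ny`, `ny_combined`, and the column names are hypothetical.

```r
# Counts for the two New York rows, copied from the table in this report.
ny <- data.frame(
  State        = c("New York", "New York City"),
  covid_deaths = c(11665, 20763),
  total_deaths = c(81391, 62910),
  stringsAsFactors = FALSE
)

# Relabel New York City as New York, then sum the counts within each name
ny$State[ny$State == "New York City"] <- "New York"
ny_combined <- aggregate(cbind(covid_deaths, total_deaths) ~ State,
                         data = ny, FUN = sum)

# Recompute PMR from the summed counts; the two PMR values must not be added or averaged
ny_combined$PMR <- ny_combined$covid_deaths / ny_combined$total_deaths

print(ny_combined)
```

Matching on the name avoids relying on the row order of the filtered data. The combined ratio comes out above New Jersey's 0.2145471, consistent with New York being the darkest state on the map.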
#Construct COVID-19 deaths map (the values mapped are COVID-19 deaths, not total deaths)
pal2 <- colorBin("YlOrRd", domain = mergedata$`COVID-19 Deaths`)
leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  setView(-98.483330, 38.712046, zoom = 4) %>%
  addPolygons(
    data = mergedata,
    fillColor = ~pal2(`COVID-19 Deaths`),
    fillOpacity = 0.7,
    weight = 0.2,
    smoothFactor = 0.2
  ) %>%
  addLegend(pal = pal2,
            values = mergedata$`COVID-19 Deaths`,
            position = "bottomright",
            title = "COVID-19 Deaths")